sparse network
Graph energy as a measure of community detectability in networks
Lucas Böttcher, Mason A. Porter, Santo Fortunato
A key challenge in network science is the detection of communities, which are sets of nodes in a network that are densely connected internally but sparsely connected to the rest of the network. A fundamental result in community detection is the existence of a nontrivial threshold for community detectability on sparse graphs that are generated by the planted partition model (PPM). Below this so-called "detectability limit", no community-detection method can perform better than random chance. Spectral methods for community detection fail before this detectability limit because the eigenvalues corresponding to the eigenvectors that are relevant for community detection can be absorbed by the bulk of the spectrum. One can bypass the detectability problem by using special matrices, like the non-backtracking matrix, but this requires one to consider higher-dimensional matrices. In this paper, we show that the difference in graph energy between a PPM and an Erdős–Rényi (ER) network has a distinct transition at the detectability threshold even for the adjacency matrices of the underlying networks. The graph energy is based on the full spectrum of an adjacency matrix, so our result suggests that standard graph matrices still allow one to separate the parameter regions with detectable and undetectable communities.
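For reference, here is a minimal Python sketch of the quantity the abstract compares (assuming numpy and networkx; the sizes and edge probabilities are illustrative, not taken from the paper): the graph energy of a two-block PPM versus an ER graph matched to the same expected mean degree.

```python
import numpy as np
import networkx as nx

def graph_energy(G):
    # Graph energy: the sum of the absolute values of the
    # eigenvalues of the adjacency matrix.
    A = nx.to_numpy_array(G)
    return np.abs(np.linalg.eigvalsh(A)).sum()

# Two-block planted partition model with within-group edge
# probability p_in and between-group probability p_out.
n, p_in, p_out = 1000, 0.010, 0.002
ppm = nx.planted_partition_graph(2, n // 2, p_in, p_out, seed=0)

# Erdos-Renyi baseline matched to the PPM's expected mean degree.
er = nx.erdos_renyi_graph(n, (p_in + p_out) / 2, seed=0)

print(graph_energy(ppm) - graph_energy(er))
```

Sweeping the difference p_in - p_out at fixed mean degree would trace out the transition in the energy difference at the detectability threshold that the abstract describes.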
A Theoretical View on Sparsely Activated Networks
Deep and wide neural networks successfully fit very complex functions today, but dense models are starting to be prohibitively expensive for inference. To mitigate this, one promising research direction is networks that activate only a sparse subgraph of the full network. The subgraph is chosen by a data-dependent routing function, enforcing a fixed mapping of inputs to subnetworks (e.g., the Mixture of Experts (MoE) paradigm in Switch Transformers). However, there is no theoretical grounding for these sparsely activated models. As our first contribution, we present a formal model of data-dependent sparse networks that captures salient aspects of popular architectures.
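As a concrete picture of the routing the abstract refers to, here is a toy Python sketch of top-1 (Switch-style) expert routing; the shapes, the softmax gate, and the single-layer setup are illustrative assumptions, not the paper's formal model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy top-1 ("Switch"-style) routing: each input activates exactly one
# of k expert MLPs, so only a sparse subgraph of the layer runs.
# Dimensions and names here are illustrative.
d, k = 16, 4
experts = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(k)]
router = rng.standard_normal((d, k)) / np.sqrt(d)

def moe_layer(x):
    logits = x @ router                              # routing scores, shape (k,)
    e = int(np.argmax(logits))                       # data-dependent expert choice
    gate = np.exp(logits[e]) / np.exp(logits).sum()  # softmax gate weight
    return gate * np.maximum(x @ experts[e], 0.0)    # only expert e is evaluated

y = moe_layer(rng.standard_normal(d))
```

The data-dependence lives entirely in the argmax: different inputs route to different experts, but each input touches exactly one.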
Channel Permutations for N:M Sparsity
We introduce channel permutations as a method to maximize the accuracy of N:M sparse networks. N:M sparsity requires N out of M consecutive elements to be zero and has been shown to maintain accuracy for many models and tasks with a simple prune-and-fine-tune workflow. By permuting weight matrices along their channel dimension and adjusting the surrounding layers appropriately, we demonstrate accuracy recovery even for small, parameter-efficient networks, without affecting inference run-time. We also present a quality metric to simplify judging permutations, as well as efficient methods to search for high-quality permutations, including two optimizations to escape local minima. Finally, we share an ablation study showing the importance of each part of our search algorithm, experimental results showing the correlation between our quality metric and final network accuracy, improved sparse network accuracy using our techniques with insignificant overhead to training time, and the transformation of unstructured to structured sparse workloads.
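To illustrate the mechanism, here is a small numpy sketch of magnitude-based N:M pruning under the paper's convention (N of every M consecutive elements set to zero) and of why a channel permutation changes the outcome; the per-group magnitude rule and random permutation here stand in for, and are much simpler than, the paper's permutation search.

```python
import numpy as np

rng = np.random.default_rng(0)

def prune_n_m(W, n=2, m=4):
    # Zero the n smallest-magnitude entries in each group of m
    # consecutive elements along the input-channel axis (2:4 pattern).
    out_ch, in_ch = W.shape
    mags = np.abs(W).reshape(out_ch, in_ch // m, m)
    drop = np.argsort(mags, axis=-1)[..., :n]       # smallest n per group
    pruned = W.reshape(out_ch, in_ch // m, m).copy()
    np.put_along_axis(pruned, drop, 0.0, axis=-1)
    return pruned.reshape(out_ch, in_ch)

W = rng.standard_normal((8, 16))

# Permuting input channels changes how weights fall into groups of m,
# and hence how much magnitude survives pruning; in a real network the
# neighboring layer absorbs the inverse permutation, so the function
# computed by the dense network is unchanged.
perm = rng.permutation(W.shape[1])
print(np.abs(prune_n_m(W)).sum(), np.abs(prune_n_m(W[:, perm])).sum())
```

The search problem the paper addresses is then to find the permutation that maximizes such a quality measure efficiently, rather than by random trial.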
On the Stability of the Jacobian Matrix in Deep Neural Networks
Benjamin Dadoun, Soufiane Hayou, Hanan Salam, Mohamed El Amine Seddik, Pierre Youssef
Deep neural networks are known to suffer from exploding or vanishing gradients as depth increases, a phenomenon closely tied to the spectral behavior of the input-output Jacobian. Prior work has identified critical initialization schemes that ensure Jacobian stability, but these analyses are typically restricted to fully connected networks with i.i.d. weights. In this work, we go significantly beyond these limitations: we establish a general stability theorem for deep neural networks that accommodates sparsity (such as that introduced by pruning) and non-i.i.d., weakly correlated weights (e.g. induced by training). Our results rely on recent advances in random matrix theory, and provide rigorous guarantees for spectral stability in a much broader class of network models. This extends the theoretical foundation for initialization schemes in modern neural networks with structured and dependent randomness.
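A toy numerical check, not the paper's proof technique: under the restricted i.i.d. setting the abstract says prior work covers, the input-output Jacobian's squared singular values concentrate around 1 at the critical ("He") scaling. All sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Input-output Jacobian of a deep ReLU network at the critical "He"
# initialization: i.i.d. Gaussian weights with variance 2/width.
width, depth = 256, 50
x = rng.standard_normal(width)

J = np.eye(width)
for _ in range(depth):
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
    pre = W @ x
    D = np.diag((pre > 0).astype(float))   # ReLU derivative
    J = D @ W @ J                          # chain rule across the layer
    x = np.maximum(pre, 0.0)

s = np.linalg.svd(J, compute_uv=False)
print((s ** 2).mean())  # stays near 1 at criticality; off-critical scalings
                        # make it grow or decay exponentially with depth
```

The paper's contribution is to extend guarantees of this kind beyond the i.i.d. Gaussian case, to sparse and weakly correlated weights.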